Date : 4/23/2020 4:06:56 PM 

From : "Yin (Whitney), Yuhui W." ywyin@utmb.edu 

To: "seanyu@epochlifescience.com" seanyu@epochlifescience.com 
Subject : Re: synthesize a clone 

Attachment : abb7498 Gao SM.pdf; 


Hi Sean, 


Let's make a native one first. Do you see a problem with expressing the native sequence in E.
coli?
This is following a published protocol, attached. 


Thanks for your rapid reply


From: Sean Yu <seanyu@epochlifescience.com>
Date: Thursday, April 23, 2020 at 3:48 PM 

To: Yuhui Yin <ywyin@UTMB.EDU> 

Subject: RE: synthesize a clone 




Hi Whitney, 


Do you need the native DNA sequence, or do you need codon optimization for E. coli
expression? Thanks


Sean 


From: Yin (Whitney), Yuhui W. <ywyin@UTMB.EDU> 
Sent: Thursday, April 23, 2020 3:17 PM 

To: Sean Yu <seanyu@epochlifescience.com>
Subject: synthesize a clone 


Hi Sean, 

Hope you are well. 

I would like to synthesize a gene for SARS-CoV-2 RNA
polymerase. Specifically, COVID-19 virus nsp12 (GenBank:
MN908947) gene was cloned into a modified pET-22a vector,
with the C-terminus possessing a 10x His-tag.


Please let me know if this can be done quickly. 
Thanks! 


Whitney 


Whitney Yin 

Department of Pharmacology and Toxicology 
University of Texas Medical Branch 
BSB3.110, 301 University Blvd, 

Galveston, TX 77555 

TEL: 409-772-9631 

EMAIL: ywyin@utmb.edu 
Movie S1 (mp4)


Materials and Methods 


Protein production and purification 
The COVID-19 virus nsp12 (GenBank: MN908947) gene was cloned into a modified pET-22a
vector, with the C-terminus possessing a 10× His-tag. The plasmids were transformed into E.
coli BL21 (DE3), and the transformed cells were cultured at 37 °C in LB media containing 100
mg/L ampicillin. After the OD600 reached 0.8, the culture was cooled to 16 °C and supplemented
with 0.5 mM IPTG. After overnight induction, the cells were harvested through centrifugation,
and the pellets were resuspended in lysis buffer (20 mM Tris-HCl, pH 8.0, 150 mM NaCl, 4 mM
MgCl2, 10% glycerol) and homogenized with an ultra-high-pressure cell disrupter at 4 °C. The
insoluble material was removed through centrifugation at 12,000 rpm. The fusion protein was first
purified by Ni-NTA affinity chromatography and then further purified by passage through a Hitrap
Q ion-exchange column (GE Healthcare, USA) before loading onto a Superdex 200 10/300
Increase column (GE Healthcare, USA) in a buffer containing 20 mM Tris-HCl, pH 7.5, 250 mM
NaCl and 4 mM MgCl2. Purified nsp12 was concentrated to 4 mg/mL and stored at 4 °C.
Full-length COVID-19 virus nsp7 and nsp8 were co-expressed in E. coli BL21 (DE3) cells as an
untagged protein and a 6× His-SUMO fusion protein, respectively. After purification by Ni-NTA
(Novagen) affinity chromatography, the nsp7-nsp8 complex was eluted through on-column
tag cleavage by ULP protease. The eluate was further purified by Hitrap Q ion-exchange column
(GE Healthcare, USA) and a Superdex 200 10/300 Increase column (GE Healthcare, USA) in a
buffer containing 20 mM Tris-HCl, pH 7.5, 250 mM NaCl and 4 mM MgCl2.

For assembling the stable nsp12-nsp7-nsp8 complex, purified nsp12 was incubated with nsp7 and
nsp8 at 4 °C for three hours, at a molar ratio of 1:2:2 in a buffer containing 20 mM Tris-HCl,
pH 7.5, 250 mM NaCl and 4 mM MgCl2. For the sample under reducing conditions, the complex was
further transferred to a reducing buffer containing 20 mM Tris-HCl, pH 7.0, 250 mM NaCl, 4 mM
MgCl2 and 4 mM DTT using a centrifugal ultrafiltration device (Amicon® Ultra Filters).


Cryo-EM sample preparation and data collection 
In total, 3 μL of protein solution at 0.7 mg/mL (supplemented with 0.025% DDM; the same for
both samples) was applied onto an H2/O2 glow-discharged, 300-mesh Quantifoil R1.2/1.3 grid
(Quantifoil, Micro Tools GmbH, Germany). The grid was then blotted for 3.0 s with a blot force
of 0 at 8 °C and 100% humidity and plunge-frozen in liquid ethane using a Vitrobot (Thermo
Fisher Scientific, USA). Cryo-EM data were collected with a 300 keV Titan Krios electron
microscope (Thermo Fisher Scientific, USA) and a K2 Summit direct electron detector (Gatan,
USA). Images were recorded in EFTEM mode at 165,000× magnification with a calibrated
super-resolution pixel size of 0.82 Å/pixel. The exposure time was set to 5 s with a total
accumulated dose of 60 electrons per Å². All images were automatically recorded using SerialEM
(74). For Dataset-1, a total of 7,994 images were collected with a defocus range from 1.0 μm to
1.8 μm. For Dataset-2 (under reducing conditions), a total of 8,494 images were collected with a
defocus range from 1.1 μm to 2.0 μm.
Statistics for data collection and refinement are in Table S1.
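
The acquisition numbers above imply a few derived quantities that are easy to check. The following is a minimal sketch, not part of the published protocol: the function names are illustrative, and the only inputs are the exposure time, total dose and pixel size quoted in the text. A per-frame dose is not computed because the number of frames is not stated.

```python
def dose_rate(total_dose_e_per_A2: float, exposure_s: float) -> float:
    """Average dose rate in electrons per square angstrom per second."""
    return total_dose_e_per_A2 / exposure_s


def nyquist_limit(pixel_size_A: float) -> float:
    """Finest spacing (in angstroms) that a given pixel size can sample."""
    return 2.0 * pixel_size_A


TOTAL_DOSE = 60.0   # electrons per square angstrom, from the text
EXPOSURE = 5.0      # seconds, from the text
PIXEL_SIZE = 0.82   # angstroms per pixel, from the text

rate = dose_rate(TOTAL_DOSE, EXPOSURE)
limit = nyquist_limit(PIXEL_SIZE)

print(f"average dose rate: {rate:.1f} e-/A^2/s")
print(f"Nyquist limit:     {limit:.2f} A")
```

Under these inputs the average dose rate works out to 12 electrons per Å² per second, and the Nyquist limit at the stated pixel size is 1.64 Å.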


Cryo-EM image processing 

All dose-fractionated images were motion-corrected and dose-weighted by MotionCor2
software (19) and their contrast transfer functions were estimated by cryoSPARC patch CTF
estimation. For Dataset-1, a total of 2,334,248 particles were auto-picked using blob picker and
extracted with a box size of 300 pixels in cryoSPARC (20). The subsequent 2D and 3D
classifications and refinements were all performed in cryoSPARC. 918,133 particles were selected
after two rounds of 2D classification. 100,000 particles were used for ab initio reconstruction
with five classes, and then these five classes were used as 3D volume templates for heterogeneous
refinement with all selected particles, with 110,176 particles converged into one class. Next, this
particle set was used to perform homogeneous refinement, yielding a resolution of 3.1 Å. After
local refinement, the final resolution reached 2.9 Å. For Dataset-2, the image processing was
conducted using a similar pipeline. 753,481 particles were auto-picked initially, and 145,388
particles were selected after final heterogeneous refinement. The resolution reached 2.99 Å after
non-uniform refinement and 2.95 Å after local refinement with a mask.
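
To make the particle attrition in this pipeline easier to follow, the sketch below tabulates the fraction of auto-picked particles retained at each reported stage and converts the 300-pixel box to ångströms. It is an illustrative calculation only: the helper name `retention` is hypothetical, and it assumes that the 0.82 Å/pixel value from the data collection section applies to the extracted boxes.

```python
PIXEL_SIZE_A = 0.82  # angstroms per pixel (assumed to apply to the extracted boxes)
BOX_SIZE_PX = 300    # box size in pixels, from the text

# Particle counts as reported in the text.
DATASET_1 = [
    ("auto-picked", 2_334_248),
    ("after 2D classification", 918_133),
    ("after heterogeneous refinement", 110_176),
]
DATASET_2 = [
    ("auto-picked", 753_481),
    ("after heterogeneous refinement", 145_388),
]


def retention(stages):
    """Return (stage, count, fraction of the first stage) for each stage."""
    initial = stages[0][1]
    return [(name, count, count / initial) for name, count in stages]


box_edge_A = BOX_SIZE_PX * PIXEL_SIZE_A

for label, stages in (("Dataset-1", DATASET_1), ("Dataset-2", DATASET_2)):
    print(label)
    for name, count, frac in retention(stages):
        print(f"  {name:32s} {count:>9,d}  {frac:6.1%}")
print(f"box edge: {box_edge_A:.0f} A")
```

With that pixel size the box edge comes to 246 Å, which matches the class-average edge length given in the Fig. S2 legend.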


Model building and refinement 

To solve the structure of the COVID-19 virus nsp12-nsp7-nsp8 complex, the structures of
SARS-CoV nsp12 (9) and nsp7-8 (21) were individually placed and rigid-body fitted into the
cryo-EM map using UCSF Chimera (22). After the corresponding amino acids were replaced with
those from COVID-19 virus, the model was manually built in Coot 0.8 (23) with the guidance of
the cryo-EM map, and in combination with real space refinement using Phenix 1.9 (24). The data
validation statistics are shown in Table S1.
Fig. S1.


The purification of COVID-19 virus nsp12 and nsp12-nsp7-nsp8 complex using a Superdex
200 10/30 column. (A) Size-exclusion chromatogram of the affinity-purified COVID-19 virus
nsp12 and (B) nsp7-nsp8 complex. Fractions from the gel filtration peaks were pooled. The target
proteins were analyzed by SDS-PAGE. Standard protein markers are shown in the first lane.
Fig. S2.


Cryo-EM reconstruction. (A) Raw image of the nsp12-nsp7-nsp8 complex particles in vitreous
ice recorded at defocus values of -1.0 to -1.8 μm. Scale bar, 50 nm. (B) Power spectrum of the
image shown in (A), with an indication of the spatial frequency corresponding to 3.0 Å resolution.
(C) Representative class averages. The edge of each square is 246 Å. (D) The data processing
scheme. Overview of nsp12-nsp7-nsp8 reconstruction is shown in the bottom panel along with
local resolution. (E) Fourier shell correlation (FSC) of the final 3D reconstruction following gold
standard refinement. FSC curves are plotted before and after masking. (F) Angular distribution
heatmap of particles used for the refinement. (G-I) Data processing procedure and corresponding
results for Dataset-2 (collected under reducing conditions).
Fig. S3.


Cryo-EM map of β-hairpin (A) and disulfide bonds (B). (A) Structure and representative map
of β-hairpin and NiRAN. (B) Raw cryo-EM map (mesh) for the nsp12-nsp7-nsp8 complex is
shown in magenta and red (COVID-19 virus) for Dataset-2 and Dataset-1, respectively, or grey
(SARS-CoV, EMD-0520). Structures near the disulfide bond region of the Interface domain
(orange) and Fingers domain (deep blue) are shown as stick models. The corresponding region in
SARS-CoV (PDB ID: 6NUR) is shown on the right.
Fig. S4.


Structure comparisons. (A) Comparison of two nsp8 molecules bound to COVID-19 virus nsp12.
The nsp8-1 (in red) refers to the individual nsp8 molecule bound to nsp12. The nsp8-2 (in green)
refers to the nsp8 molecule in the nsp7-nsp8 pair. (B) An overview of the complex showing how two
nsp8 units bind with different conformations to nsp12. (C) The structural difference between the
COVID-19 virus nsp12-nsp7-nsp8 complex and the SARS-CoV nsp12-nsp7-nsp8 complex (colored by
RMSD-Full in Chimera). (D) Comparison of COVID-19 virus nsp12 (in color) and HCV ns5b (in
grey). The experimental EM map covering the active site of COVID-19 virus nsp12 is shown as
mesh in the right panel.
COVID-19 Virus nsp12-nsp7-nsp8 complex (Dataset-1)

| β-hairpin | Contact^a | Target Residues |
| R33 | 1,1 | D126,K121 |
| A34 | 1,1 | D126,A125 |
| D36 | 1,5,2 | D208,Y728,S236 |
| Y38 | 2,1,2 | Y728,H725,E729 |
| N39 | 4 | H725 |
| A43 | 3 | Q724 |
| G44 | 1 | Y728 |
| F45 | 1 | S709 |
| A46 | 2 | S709 |
| K47 | 2 | Y129 |
| F48 | 2 | D711 |

COVID-19 Virus nsp12-nsp7-nsp8 complex under reducing condition (Dataset-2)

| β-hairpin | Contact^a | Target Residues |
| R33 | 1 | Y122 |
| A34 | 1 | D126 |
| F35 | 1 | D208 |
| D36 | 1,1,1 | D208,Y728,L240 |
| Y38 | 1,1,1,3 | Y728,H725,R733,E729 |
| N39 | 2 | H725 |
| A43 | 3 | Q724 |
| F45 | 1 | L708 |
| A46 | 1 | S709 |
| F48 | 1 | D711 |

Fig. S5. 


Interaction between the β-hairpin and other domains. ^a Numbers represent the number of
atom-to-atom contacts between the residues of the β-hairpin and the residues in other domains.
These were calculated by the Contact program in the CCP4 suite (with a distance cutoff of 3.5 Å).
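
The idea behind an atom-to-atom contact count with a distance cutoff can be sketched in a few lines. This is not the CCP4 Contact program and does not reproduce its options or output; the function name `count_contacts` and the coordinates are hypothetical, and treating a distance equal to the cutoff as a contact is an assumption.

```python
from math import dist


def count_contacts(atoms_a, atoms_b, cutoff=3.5):
    """Count atom pairs, one atom from each residue, within `cutoff` angstroms."""
    return sum(1 for a in atoms_a for b in atoms_b if dist(a, b) <= cutoff)


# Hypothetical (x, y, z) coordinates in angstroms, for illustration only.
residue_a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
residue_b = [(3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]

n_contacts = count_contacts(residue_a, residue_b)
print(f"contacts within 3.5 A: {n_contacts}")
```

Each comma-separated number in the Contact column would correspond to one such count for a given pair of residues.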
[Figure: alignment of the COVID-19 virus, SARS-CoV and RaTG13 nsp12 sequences (residues 1 to 932), annotated with the NiRAN, Interface, Fingers, Palm and Thumb domains and RdRp motifs A, B, C, D, F and G.]



Fig. S6. 


Sequence alignment of nsp12 proteins encoded by COVID-19 virus, SARS-CoV and
RaTG13. The residues with blue, light blue or white backgrounds indicate the identical, conserved
or non-conserved residues, respectively. Domain arrangement and key RdRp motifs are
highlighted using the same color scheme as in Fig. 1.
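
The share of identical residues in an alignment like this one can be computed column by column. The sketch below is illustrative only: `percent_identity` is a hypothetical helper, gap columns (marked "-") are skipped by assumption, and the two 20-residue fragments are the opening residues of the COVID-19 virus and SARS-CoV rows as they appear in the alignment, so the result says nothing about full-length identity.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned, non-gap columns in which both sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# First 20 aligned residues only, taken from the alignment rows.
covid19_fragment = "SADAQSFLNRVCGVSAARLT"
sars_cov_fragment = "SADASTFLNRVCGVSAARLT"

identity = percent_identity(covid19_fragment, sars_cov_fragment)
print(f"identity over {len(covid19_fragment)} columns: {identity:.1f}%")
```

The two fragments differ at positions 5 and 6, so 18 of the 20 columns are identical.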
Fig. S7.
Chemical structures of the prodrugs of (A) remdesivir and (B) sofosbuvir.
Table S1.
Cryo-EM data statistics. 


Movie S1. 
Experimental cryo-EM map of nsp12 N-terminal region (A4 to R118) 